HotMatch results for OAEI 2012
Abstract
HotMatch is a multi-strategy matcher developed by a group of students at Technische Universität Darmstadt in the course of a hands-on training. It implements various matching strategies. The tool version submitted to OAEI 2012 combines different basic matching strategies, both element-based and structure-based, with a set of filters for removing faulty mappings.

1 Presentation of the system

1.1 State, purpose, general statement

HotMatch (the name stands for "Hands-on training matcher") has been developed by a group of students in the course of a semantic web hands-on training conducted at TU Darmstadt. The students were asked to develop and implement different matching algorithms. For OAEI 2012, we have combined a large number of those matching algorithms into one tool. Figure 1 gives an overview of all matchers and filters. While matchers generate mapping elements, filters remove mapping elements found by previous matchers.

Fig. 1. Overview of the matching and filtering algorithms implemented in HotMatch.

1.2 Specific techniques used

HotMatch provides a library of different matching algorithms and filters.

Matching Algorithms

ElementStringMatcher is a simple string-based matcher on the element level. All labels, URI fragments, and comments are extracted and tokenized. In a second step, stopwords are removed. To obtain a similarity measure for two concepts, every pair in the cross product of their labels, fragments, and comments is compared using the Damerau–Levenshtein distance.

GraphbasedUseClassMatcher is a graph-based matcher. It operates on the structural level and needs an input alignment that provides an initial mapping between classes. Figure 2 gives an example of the mapping candidates. Properties X and Y are matched if their domains and their ranges are either equal or have been aligned by a previous matcher. The confidence of the new mapping between the two properties is the mean of the confidences of the mappings A to C and B to D.

Fig. 2. New mapping of GraphbasedUseClassMatcher. Classes A and C as well as B and D are already matched. Properties X and Y are therefore also matched.

GraphbasedUsePropertyMatcher is a modification of GraphbasedUseClassMatcher. It uses properties from previous alignments instead of classes. If two properties have been matched by a previous approach, their domains and their ranges are also matched in a new alignment, inheriting the confidence of the mapping between the properties.

SimilarityFlooding implements the structural similarity flooding matching algorithm described in [3].

FlowerMatcher is a matching algorithm which combines a structural and an element-based approach. For each ontology class, its neighborhood (superclasses, subclasses, and properties that have this class as domain or range) is considered. From the names and labels of all concepts in the neighborhood, a joint set of trigrams is computed. The trigram sets of two classes are compared to determine the class similarity.

ModelbasedMatcher currently only checks whether the union of the two ontologies plus the input mappings is valid. The implementation uses the Pellet reasoner. In the future, this matcher is supposed to add extra mappings derived by reasoning, as well as to discard mappings that cause a contradiction.

DistributionSynonymMatcher and WikipediaCorpusMatcher are matchers using external resources, namely the online API lanes (footnote 2). The distribution synonym matcher tries to identify synonyms based on distributional similarity, i.e., the similarity of the contexts in which two words occur [1]. The Wikipedia corpus matcher computes the percentage of Wikipedia pages on which two terms co-occur (similar to the approach discussed in [2]).

SynonymMatcher uses the online thesaurus Big Huge Thesaurus (footnote 3) to find mappings between concepts.

Filters

OriginalHostsFilter extracts the major host component of the input ontologies' URIs. If a mapping refers to a URI host other than the major one, it is removed. The remaining mappings are not changed.
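The host check described above can be sketched as follows. This is a minimal illustration in Python, not HotMatch's actual code: the tuple layout (source, target, relation, confidence), the function names, the example.org URIs, and the reading of "major host" as the most frequent host among an ontology's entity URIs are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def major_host(entity_uris):
    """Return the most frequent host among an ontology's entity URIs.

    Assumption: "major host" is taken to mean the most frequent one.
    """
    hosts = Counter(urlparse(uri).netloc for uri in entity_uris)
    return hosts.most_common(1)[0][0]


def original_hosts_filter(mappings, uris_one, uris_two):
    """Keep only mappings whose entities live on the ontologies' major hosts."""
    host_one = major_host(uris_one)
    host_two = major_host(uris_two)
    return [
        m for m in mappings
        if urlparse(m[0]).netloc == host_one
        and urlparse(m[1]).netloc == host_two
    ]


# Hypothetical ontologies that both reuse a Dublin Core property.
DC_DESCRIPTION = "http://purl.org/dc/elements/1.1/description"
uris_one = [
    "http://a.example.org/onto#Paper",
    "http://a.example.org/onto#Author",
    DC_DESCRIPTION,
]
uris_two = [
    "http://b.example.org/onto#Article",
    "http://b.example.org/onto#Writer",
    DC_DESCRIPTION,
]
mappings = [
    ("http://a.example.org/onto#Paper", "http://b.example.org/onto#Article", "=", 0.9),
    (DC_DESCRIPTION, DC_DESCRIPTION, "=", 1.0),
]

kept = original_hosts_filter(mappings, uris_one, uris_two)
print(kept)  # only the Paper/Article mapping remains
```

The mapping between the two imported Dublin Core properties is dropped because its host (purl.org) is not the major host of either ontology.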
This filter is necessary because a mapping such as <http://purl.org/dc/elements/1.1/description, http://purl.org/dc/elements/1.1/description, =, 1.0> is certainly correct, but is not contained in the reference alignments. In the OAEI tracks, it therefore counts as a false positive and reduces the matcher's precision.

CardinalityFilter enforces a 1:1 alignment. If a resource from ontology one is matched to multiple resources from ontology two, only the mapping with the highest confidence is selected. All other mappings are discarded. The same procedure is then applied for ontology two. The result of this filter is an alignment that relates each element from one ontology to at most one element from the other ontology.

ConfidenceFilter is a simple filter that removes all mappings whose confidence is below a given threshold.

DomainRangeFilter discards all property mappings whose domains and ranges are not matched. This is particularly useful for discarding inverses (e.g., isReviewerOf vs. hasReviewer), which receive high similarity scores with simple element-based techniques.

DatatypeRangeFilter checks only datatype properties. Matched properties that have different datatypes (e.g., string vs. date) are discarded.

SynonymFilter has been implemented as a variant of the SynonymMatcher (see above). Since the latter has been shown to produce too many false positives (but with reasonable recall), it can also be used as a filter, e.g., on structural approaches for improving precision.

Footnote 2: Language Analysis Essentials, http://research.wilsonwong.me/lanes.html
Footnote 3: http://words.bighugelabs.com/

1.3 Adaptations made for the evaluation

The final matcher composition of the version submitted to OAEI 2012 is shown in figure 3. The threshold for the confidence filter is set to t = 0.7. Note that not all matchers and filters discussed above are included in the final composition.
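As a rough illustration of the ConfidenceFilter at t = 0.7 and of the CardinalityFilter, consider the following Python sketch. It is not taken from HotMatch: the tuple layout (source, target, relation, confidence), the function names, the entity names, and the tie-breaking rule (the first candidate wins) are assumptions.

```python
def confidence_filter(mappings, threshold=0.7):
    """Drop every mapping whose confidence is below the threshold."""
    return [m for m in mappings if m[3] >= threshold]


def cardinality_filter(mappings):
    """Enforce a 1:1 alignment.

    First keep the highest-confidence mapping per source entity, then
    repeat the same selection per target entity on what is left.
    """
    def best_per(key_index, candidates):
        best = {}
        for m in candidates:
            key = m[key_index]
            # Strict ">" means the first candidate wins a tie (assumption).
            if key not in best or m[3] > best[key][3]:
                best[key] = m
        return [m for m in candidates if best[m[key_index]] is m]

    return best_per(1, best_per(0, mappings))


# Hypothetical candidate mappings between ontologies A and B.
candidates = [
    ("A#Paper", "B#Article", "=", 0.9),
    ("A#Paper", "B#Document", "=", 0.75),
    ("A#Review", "B#Article", "=", 0.8),
    ("A#Author", "B#Writer", "=", 0.6),
]

confident = confidence_filter(candidates)    # drops the 0.6 mapping
one_to_one = cardinality_filter(confident)   # resolves A#Paper and B#Article
print(one_to_one)
```

In this example A#Review ends up without a partner: its only candidate, B#Article, is claimed by the higher-confidence mapping from A#Paper. A greedy selection of this kind trades some recall for a guaranteed 1:1 result.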
We discarded all components that did not improve the system's accuracy and, in case of ties, favored faster components over slower ones.

All matchers are composed sequentially. The upper lane of figure 3 shows the matchers, which generate new mappings. The lower lane shows the filters, which remove mappings that are not in the reference alignment in order to improve precision.

Fig. 3. Final composition for the evaluation

Although the filters only remove elements from the mapping generated by the matchers, they cannot be arbitrarily permuted. For example, the cardinality filter enforcing a 1:1 mapping selects the candidate with the highest confidence. If a mapping element with a higher confidence has already been removed, e.g., by the OriginalHostsFilter, the selection will be different. Consider the following constellation for a mapping between ontologies A and B, where B imports the FOAF ontology (http://xmlns.com/foaf/spec/):

<A#person, B#author, =, 0.7> (1)
<A#person, foaf#person, =, 0.8> (2)

Using the CardinalityFilter first would discard the first element, and the second one would then be discarded by the OriginalHostsFilter, so that no mapping for A#person remains. On the other hand, using the OriginalHostsFilter first would discard the second element, with the first one passing the CardinalityFilter.

1.4 Link to the system and parameters file

The tool version submitted to OAEI 2012 can be downloaded from http://www.ke.tu-darmstadt.de/resources/ontology-matching/hotmatch.
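The order dependence of the filters discussed in Section 1.3 can be reproduced with a few lines of Python. The sketch below is only an illustration under assumptions: the two functions are simplified stand-ins for the CardinalityFilter and the OriginalHostsFilter, the tuple layout (source, target, relation, confidence) is invented for the example, and the host check is reduced to a prefix test on the abbreviated identifiers used in mappings (1) and (2).

```python
def cardinality_filter(mappings):
    """Keep the highest-confidence mapping per source, then per target."""
    for index in (0, 1):
        best = {}
        for m in mappings:
            if m[index] not in best or m[3] > best[m[index]][3]:
                best[m[index]] = m
        mappings = [m for m in mappings if best[m[index]] is m]
    return mappings


def original_hosts_filter(mappings, own_prefixes=("A#", "B#")):
    """Keep mappings whose entities both belong to ontology A or B.

    Assumption: a prefix test stands in for the real URI host comparison.
    """
    return [
        m for m in mappings
        if m[0].startswith(own_prefixes) and m[1].startswith(own_prefixes)
    ]


candidates = [
    ("A#person", "B#author", "=", 0.7),     # mapping (1)
    ("A#person", "foaf#person", "=", 0.8),  # mapping (2)
]

# CardinalityFilter first: (2) wins the 1:1 selection and is then removed.
cardinality_first = original_hosts_filter(cardinality_filter(candidates))

# OriginalHostsFilter first: (2) is removed, so (1) survives the 1:1 selection.
hosts_first = cardinality_filter(original_hosts_filter(candidates))

print(cardinality_first)  # []
print(hosts_first)        # [('A#person', 'B#author', '=', 0.7)]
```

Running the host-based filter before the cardinality filter keeps mapping (1), whereas the opposite order loses both candidates.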
Publication date: 2012